Mining Large-Scale Retail Data

Author

  • Vincent Leroy
Abstract

Understanding customer buying patterns is of great interest in the retail industry. Applications include targeted advertising, optimized product placement, and cross-promotions. Association rules, expressed as A → B (if A then B), are a common and easily understandable way to represent buying patterns. While the problem of mining such rules has received considerable attention over the past years, most of the approaches proposed have only been evaluated on relatively small datasets, and struggle at large scale. In the context of the Datalyse project, Intermarché, our industrial partner, has given us access to 2 years of sales data: 3.5B sales records, 300M tickets, 9M customers, 200k products. This constitutes an opportunity to revisit the problem of association rule mining in the context of “big data”. In the remainder of this paper, I will first give an overview of our work on designing mining algorithms adapted to long-tailed datasets. Then, I will describe our evaluation of quality measures for ranking association rules. Finally, I will present the systems architecture deployed to apply mining in production at Intermarché.
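
To make the A → B notation concrete, the sketch below computes two standard measures for a single rule, support and confidence, over a toy list of tickets. It is only an illustration: the abstract does not say which quality measures were evaluated, and the function name `rule_stats` and the sample tickets are hypothetical, not taken from the Datalyse work or the Intermarché data.

```python
def rule_stats(tickets, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of tickets containing both A and B
    confidence = fraction of tickets containing A that also contain B
    """
    a = frozenset(antecedent)
    ab = a | frozenset(consequent)
    n_a = sum(1 for t in tickets if a <= t)    # tickets containing A
    n_ab = sum(1 for t in tickets if ab <= t)  # tickets containing A and B
    support = n_ab / len(tickets) if tickets else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


# Hypothetical toy tickets, one set of products per ticket.
tickets = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
]

support, confidence = rule_stats(tickets, {"bread"}, {"butter"})
print(f"bread -> butter: support={support:.2f}, confidence={confidence:.2f}")
```

A linear scan like this is fine for a handful of tickets, but it says nothing about how to enumerate candidate rules or cope with hundreds of millions of tickets, which is the problem the paper addresses.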

Similar resources

A Scalable Bottom-Up Data Mining Algorithm for Relational Databases

Machine learning induction algorithms are difficult to scale to very large databases because of their memory-bound nature. Using virtual memory results in a significant performance degradation. To overcome such shortcomings, we developed a classification rule induction algorithm for relational databases. Our algorithm uses a bottom-up rule generation strategy that is more effective for mining d...

Using Association Rule Mining for Extracting Product Sales Patterns in Retail Store Transactions

Computers and software play an integral part in the working of businesses and organisations. An immense amount of data is generated with the use of software. These large datasets need to be analysed for useful information that would benefit organisations, businesses and individuals by supporting decision making and providing valuable knowledge. Data mining is an approach that aids in fulfilling...

Generating Customer Profiles for Retail Stores Using Clustering Techniques

The retail industry collects huge amounts of data on sales, customer buying history, goods transportation, consumption, and service. With increased availability and ease of use of modern computing technology and e-commerce, the availability and popularity of such businesses has grown rapidly. Many retail stores have websites where customers can make online purchases. These factors have resulted...

Analysis of Customers' Spatial Distribution Through Transaction Datasets

Understanding people’s consumption behavior while traveling between retail shops is essential for successful urban planning as well as determining an optimized location for an individual shop. Analyzing customer mobility and deducing their spatial distribution help not only to improve retail marketing strategies, but also to increase the attractiveness of the district through the appropriat...


Publication date: 2016